Allocating registers in multiple instruction-issuing processors
Abstract
This work addresses the problem of scheduling a basic block of operations on a multiple instruction-issuing processor. We show that integrating register constraints into operation sequencing algorithms is a complex problem in itself. Indeed, while scheduling a forest of unit-time operations on a processor with P parallel instruction slots can be solved in polynomial time, the problem becomes NP-hard when P is unbounded but only R registers are available. As a result, we have devised a concise integer linear programming formulation of this scheduling problem that accounts for both register and instruction-issuing constraints. This allows the use of off-the-shelf routines to find optimum solutions, which can then be compared with the results obtained by polynomial-time heuristics. Two such heuristics are given, and their combined results are shown to be optimal in 99.5% of the cases for trees of height at most 6. A byproduct of these experiments is to show that our integer programming formulation is quite practical, as it can find an optimum solution for a tree of height 6 in roughly 0.1 seconds on a SPARC workstation.
French abstract (translated). Register allocation for multiple instruction-issue processors. We study the problem of scheduling a basic block of operations on a multiple instruction-issue processor. We show that taking register constraints into account is a difficult problem in itself, even in the absence of the resource constraints tied to bounded parallelism. Indeed, while scheduling a forest of unit-time operations on a processor with P parallel instructions is a polynomial problem, the problem becomes NP-hard when P is unbounded but the number of registers R is finite. We formulate this problem as an integer linear program of limited complexity that accounts for both register and bounded-parallelism constraints. This lets us find the optimal solution with any available solver, for the purpose of evaluating heuristics. We propose two heuristics which, combined, give the optimal solution in 99.5% of cases for all binary trees of height at most 6. Solving by integer programming proved to be of quite affordable practical interest, since …
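The tension between issue slots and registers described in the abstract can be illustrated with a small sketch. The abstract does not name the polynomial-time method it refers to; the code below assumes highest-level-first list scheduling (commonly known as Hu's algorithm), which is known to be optimal for in-forests of unit-time operations on P slots. The function names `hu_schedule` and `max_live_registers`, the `parent` map encoding, and the register model (a result occupies a register from the end of its step until the step in which its consumer issues) are all illustrative assumptions, not the paper's formulation.

```python
from collections import defaultdict


def hu_schedule(parent, P):
    """Highest-level-first list scheduling of an in-forest of unit-time ops.

    parent maps each operation to the operation that consumes its result
    (None for a root). Returns a list of steps, each with at most P ops.
    This is an assumed stand-in for the polynomial-time method in the text.
    """
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    # level = number of operations on the path from a node up to its root
    level = {}

    def compute_level(node):
        if node not in level:
            par = parent[node]
            level[node] = 1 if par is None else 1 + compute_level(par)
        return level[node]

    for node in parent:
        compute_level(node)

    pending = {node: len(children[node]) for node in parent}
    ready = [n for n in parent if pending[n] == 0]
    schedule = []
    while ready:
        # Issue the P ready operations that are farthest from their root.
        ready.sort(key=lambda n: (-level[n], str(n)))
        step, ready = ready[:P], ready[P:]
        schedule.append(step)
        for node in step:
            par = parent[node]
            if par is not None:
                pending[par] -= 1
                if pending[par] == 0:
                    ready.append(par)
    return schedule


def max_live_registers(parent, schedule):
    """Peak number of live results under the assumed register model."""
    time = {node: t for t, step in enumerate(schedule) for node in step}
    peak = 0
    for t in range(len(schedule)):
        live = sum(
            1
            for n in parent
            if time[n] <= t and (parent[n] is None or time[parent[n]] > t)
        )
        peak = max(peak, live)
    return peak


# Hypothetical expression tree: g = op(e, f), e = op(a, b), f = op(c, d)
tree = {"a": "e", "b": "e", "c": "f", "d": "f", "e": "g", "f": "g", "g": None}
two_wide = hu_schedule(tree, 2)
one_wide = hu_schedule(tree, 1)
# A hand-written sequential order that finishes each subtree before starting
# the next one.
subtree_first = [["a"], ["b"], ["e"], ["c"], ["d"], ["f"], ["g"]]
```

On this small tree, the level-based order on a single slot keeps four results live at its peak, while the hand-written subtree-first order of the same length needs only three under the same model. Minimizing schedule length alone therefore says nothing about register pressure, which is the kind of interaction the abstract's combined formulation is meant to capture.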
Similar resources
Scalar Program Performance on Multiple-Instruction-Issue Processors with a Limited Number of Registers
In this paper the performance of multiple-instruction-issue processors with variable register file sizes is examined for a set of scalar programs. We make several important observations. First, multiple-instruction-issue processors can perform effectively without a large number of registers. In fact, the register files of many existing architectures (16-32 registers) are capable of sustaining a high ...
Scalar Program Performance on Multiple - Instruction
In this paper the performance of multiple-instruction-issue processors with variable register file sizes is examined for a set of scalar programs. We make several important observations. First, multiple-instruction-issue processors can perform effectively without a large number of registers. In fact, the register files of many existing architectures (16-32 registers) are capable of sustaining a hig...
Register Spilling for Specific Application Domains in Application Specific Instruction-set Processors
An Application Specific Instruction-set Processor (ASIP) is an important component in designing embedded systems. One of the problems in designing an instruction set for such processors is determining the number of registers needed in the processor to optimize the computational time and the cost. The performance of a processor may fall short due to register spilling, which is caused b...
Floating accumulator architecture
Although technology advancement can pack more and more physical registers in processors, the numbers of architectured registers defined by the instruction set architectures (ISAs) remain relatively small on most modern processors. Exposing more architectured registers to compilers and programmers can improve the effectiveness of compiler optimization and the quality of code. However, increasing...
Instruction Level Parallelism through Microthreading - A Scalable Approach to Chip Multiprocessors
Most microprocessor chips today use an out-of-order instruction execution mechanism. This mechanism allows superscalar processors to extract reasonably high levels of instruction level parallelism (ILP). The most significant problem with this approach is a large instruction window and the logic to support instruction issue from it. This includes generating wake-up signals to waiting instruction...